Abstract: Cooperative diversity systems are wireless communication systems designed to exploit cooperation among users
to mitigate the effects of multipath fading. In fairly general 
conditions, it has been shown that these systems can achieve 
the diversity order of an equivalent MISO channel and, if the 
node geometry permits, virtually the same outage probability can 
be achieved as that of the equivalent MISO channel for a wide 
range of applicable SNR. However, much of the prior analysis 
has been performed under the assumption of perfect timing and 
frequency offset synchronization. In this paper, we derive the 
estimation bounds and associated maximum likelihood estimators 
for frequency offset estimation in a cooperative communication 
system. We show the benefit of adaptively tuning the frequency 
of the relay node in order to reduce estimation error at the 
destination. We also derive an efficient estimation algorithm, 
based on the correlation sequence of the data, which has mean 
squared error close to the Cramer-Rao Bound. 



I. Introduction 

Collaborative communication systems employ cooperation 
among nodes in a wireless network to increase data throughput 
and robustness to signal fading. Much of the research done 
in this area has concentrated on information theoretic results, 
protocols, and coding while assuming perfect synchronization [1]-[6]. In this paper, we explore frequency synchronization of a collaborative system and provide estimation
bounds and practical algorithms having performance close to 
the bounds. 

In a collaborative system, nodes that would have remained 
silent during some period of time adapt to their surroundings 
and collaborate with the source and destination nodes. These 
systems, sometimes termed cooperative diversity systems, use 
distributed protocols to greatly improve performance over 
traditional point-to-point communication systems. One im- 
provement to system performance comes in the form of added 
robustness to signal fading [1], [2]. An effective way to achieve 
robustness is to increase the spatial diversity by using multiple 
antennas as in a MIMO system [7], [8]. However, when con- 
sidering a network of low-cost wireless devices, the size and 
cost of multiple antennas is prohibitive for these devices [9]. 
A way for low cost nodes to realize much of the benefit of a 
MIMO system is through collaborative (cooperative) diversity. 
Contract FA8721-05-C-0002. Opinions, interpretations, conclusions and recommendations are those of the authors and are not necessarily endorsed by
the United States Government. 
Conference on Distributed Computing Systems, June, 2007.



In fact, in [1] it is shown that a collaborative system can 
have the same diversity order as an equivalent MISO system. 
Employing a collaborative protocol in a wireless network can 
also increase the overall throughput of the network. The use 
of relaying is a special case of network coding and as shown 
in [10], the capacity of a relay (or coded) network is greater 
than in a traditional point-to-point network. 

To design a practical collaborative communication system, 
one of two methods may be used. The signal modulation and 
coding may be designed to be naturally robust to synchro- 
nization errors [11], or alternatively, the frequency and timing 
offsets are estimated and subsequently compensated [12]. We 
explore the second option in this paper. Algorithms and bounds 
for standard synchronization are found in [13]-[15]. The related case of a MIMO channel with multiple frequency offsets
is treated in [16], [17]. In this paper, we provide more details 
and extend the results of [18]. We derive the transmission 
frequency the relay must use to optimally reduce the variance 
of the frequency estimator at the destination by minimizing 
the Cramer-Rao Bound (CRB) of the frequency estimators 
at each receive node. By using the CRB, our frequency 
selection algorithm is independent of algorithm choice. We 
also provide an efficient frequency estimation algorithm for 
the collaborative system. 

In [12], Shin et al. describe a specific protocol, which
we use in this paper, for collaborative communication with 
synchronization among three nodes: a source, a relay, and 
a destination. The protocol is based on a two-phase trans- 
mission within each frame [1], [4], a listening phase and a 
cooperation phase. Within each phase there is a preamble 
containing synchronization signals. In the listening phase, 
the relay receives and decodes the source's message. During 
the cooperation phase, the relay re-encodes and transmits 
the message cooperatively with the source. This process is
illustrated in Figure 1.

The synchronization algorithms in [12] are ad-hoc and 
meant only to serve as a proof-of-concept that synchronization 
is possible with collaborative systems. In this paper, we derive 
the CRB for optimal frequency offset estimation for the class 
of systems discussed above. We show there exists an optimal 
(with respect to minimizing the CRB) frequency of transmis- 
sion for the relay node based on: 1) the accuracy of estimation 
during the listening phase and 2) the SNR of all node pairs. 
We derive the maximum-likelihood (ML) frequency estimators 
for each receive node. These estimators are asymptotically 
efficient, meaning they achieve the CRB at high signal-to-noise 
Fig. 1. Illustration of the two phases in a three node cooperative communication system.



ratio (SNR). However, the ML solution is computationally 
expensive and we therefore derive a practical correlation 
based estimation algorithm with performance close to the 
CRB. For the purposes of this paper, we assume a frequency 
selective fading model and that timing synchronization has 
been performed. Future papers will extend this work to include 
timing estimation and synchronization. We also assume all 
training sequences are constant modulus signals. 

This paper is organized as follows. Section II outlines the
mathematical model describing the signals involved in the
frequency estimation portion of each phase. The CRB and ML
estimators are derived in Sections III and IV for the listening
and cooperation phases respectively. Section V provides some
simulation results to illustrate the behavior and performance
of frequency estimation in the three node relay system while
Section VI shows the mean squared error (MSE) performance
of each algorithm as compared with the CRB.

The following notation is used throughout: italic letters
($x$) represent scalar quantities, bold lowercase letters ($\mathbf{x}$)
represent vectors, bold uppercase letters ($\mathbf{A}$) represent matrices,
$(\cdot)^T$ denotes transpose, $\overline{(\cdot)}$ denotes complex conjugation,
$(\cdot)^H = \overline{(\cdot)}^T$ denotes complex conjugate transpose, $\|\cdot\|$ denotes
the 2-norm of a vector, $\Re(\cdot)$ denotes the real part of a complex
number, $E_w(\cdot)$ denotes the expectation operator with respect
to the random variable $w$, $\mathcal{N}(\mu, \sigma^2)$ represents the Gaussian
distribution with mean $\mu$ and variance $\sigma^2$, and $\mathcal{CN}(\mu, \sigma^2)$
represents the circularly symmetric complex Gaussian distribution,
i.e., where the real and imaginary parts are independent
and identically distributed Gaussian random variables with
variance $\sigma^2/2$.



II. System Model 

The system model is defined in this section. During each
phase, a preamble consisting of a certain number of samples
($N_\ell$ for listening and $N_c$ for cooperation) is used for frequency
synchronization. We assume the transmission channel is frequency
selective with a channel impulse response $P$ samples
long. Due to differences in local oscillator characteristics, the
operating frequency of each node is slightly different. Let $f_s$
denote the operating frequency of the source node, with similar
definitions for $f_r$ and $f_d$. The notation $sd$ is used to denote
the source to destination link and likewise for $sr$ and $rd$. As
link $sd$ is used in each phase, let $sd_\ell$ denote the link during
the listening phase and $sd_c$ the link during the cooperation phase.

Each transmitted signal is received and converted to baseband
for subsequent processing. During the listening phase,
the baseband signal of link $a \in \{sd_\ell, sr\}$ is expressed as [15]

$$ y_a[n] = e^{j 2\pi f_a n} s_a[n] + w_b[n], \quad (1) $$

where $n$ is the sample index, $f_a$ is the frequency offset
between the two nodes of link $a$ normalized by the sample
rate, $w_b[n]$ is the noise generated in the electronics of receiver
$b \in \{d, r\}$ (destination or relay node respectively), and $s_a[n]$
is the combination of the known training signals ($\mathbf{x}_\ell =
[x_\ell[0], \ldots, x_\ell[N_\ell - 1]]^T$) and the effects of the frequency
selective channel, given by

$$ s_a[n] = \sum_{k=0}^{P-1} h_a[k]\, x_\ell[n-k]. $$

In this equation, $h_a[n]$ are the samples of the channel response
for link $a$ and $P$ is the duration of the channel response. We
assume, for each link $a$, the length of the channel $P$ is the
same. Writing (1) in matrix form gives

$$ \mathbf{y}_a = \mathbf{V}_{f_a} \mathbf{X}_\ell \mathbf{h}_a + \mathbf{w}_b \quad (2) $$

where $[\mathbf{V}_{f_a}]_{nn} = e^{j 2\pi f_a n}$ is a diagonal matrix and $[\mathbf{X}_\ell]_{ik} =
x_\ell[i-k]$ is a Toeplitz matrix where $x_\ell[k] = 0$ for $k < 0$ and
$k > N_\ell$.

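In the cooperation phase, the signal is defined as follows,

$$ \mathbf{y}_c = \mathbf{V}_{f_{sd}} \mathbf{X}_{sd_c} \mathbf{h}_{sd_c} + \mathbf{V}_{f_{rd}} \mathbf{X}_{rd} \mathbf{h}_{rd} + \mathbf{w}_d, \quad (3) $$

where we assume the frequency $f_{rd}$ is constant over both
phases. For each receiver, $b$, the noise is assumed to be a zero-mean
circularly symmetric complex Gaussian random vector

$$ \mathbf{w}_b \sim \mathcal{CN}(\mathbf{0}, \sigma_b^2 \mathbf{I}). $$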

In the general case, the frequency offsets between nodes
can take on any values within the Doppler spread of the
system plus the frequency differences of the local oscillators.
We assume the maximum frequency offset is bounded and
use this information to calculate the CRB and ML frequency
estimators. In the remainder of the paper, we assume the nodes
are stationary and thus the signals have no Doppler spread.
A statistical model for the frequency offset is used as prior
information to aid in frequency estimation. Let the operating
frequency of each node $m \in \{r, s, d\}$ be modeled as

$$ f_m = f_o + q_m, $$

where $f_o$ is the mean operating frequency and $q_m$ is a random
variable with mean zero and variance $\sigma_m^2$. We assume the
random variables $q_m$ are independent. For this paper, we also
assume $\sigma_m^2 = \sigma_f^2$ for all nodes $m$, which is an appropriate
model when considering a group of identical nodes cooperating
together. The frequency offsets to be estimated are the
difference between two of these independent random variables
and thus the frequencies, $f_a$ for $a \in \{sd, sr, rd\}$, have mean
zero, variance $2\sigma_f^2$, and are correlated.
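The second-order statistics stated above follow from writing each link offset as a difference of two node frequencies. The sketch below is illustrative only: it assumes the sign convention $f_{ab} = f_b - f_a$ (the convention used later for $f_{rd}$), and the function name `offset_covariance` is hypothetical.

```python
from fractions import Fraction

def offset_covariance(sigma_f2):
    """Covariance of the link offsets implied by f_m = f_o + q_m."""
    # Each link offset is written as a combination of (q_s, q_r, q_d),
    # assuming f_ab = f_b - f_a; the common mean f_o cancels.
    links = {
        "sd": (-1, 0, 1),
        "sr": (-1, 1, 0),
        "rd": (0, -1, 1),
    }
    cov = {}
    for a, wa in links.items():
        for b, wb in links.items():
            # q_s, q_r, q_d are independent, each with variance sigma_f2
            cov[(a, b)] = sigma_f2 * sum(u * v for u, v in zip(wa, wb))
    return cov

cov = offset_covariance(Fraction(1))  # results in units of sigma_f^2
```

Each diagonal entry equals $2\sigma_f^2$ and the $(sd, sr)$ entry equals $\sigma_f^2$, which is the cross-correlation used in the cooperation-phase analysis.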

III. Listening Phase 

In the listening phase, the destination and the relay receive 
the same signal through two different channels. We drop the 
subscript a when considering only the single node-to-node 
link. To derive a good estimator for the frequency, it is useful
to know the distribution of $q_m$. However, this is not known,
so it is reasonable to design an estimator based on the "worst
case" distribution constrained to the known statistics, i.e., a
mini-max estimator. As frequency estimation is inherently
non-linear, an asymptotic analysis is performed in the high
SNR regime (i.e., SNR $\gg 1$). Under this assumption, the
variance of a ML or maximum a posteriori (MAP) estimator
is equal to the CRB. In the remainder of this section, we show
that a Gaussian distribution with mean zero and variance $\sigma_f^2$
for $q_m$ maximizes the CRB of the frequency estimate over all
distributions with the same mean and variance. We then derive
the MAP estimator of $f$.

A. Cramer-Rao Bound 

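The unknown parameters in the single node-pair model (2)
are $f$ (which is modeled as a random variable with mean
zero and variance $2\sigma_f^2$) and $\mathbf{h}$.¹ The CRB is defined to
be the diagonal entries of the inverse Fisher Information
Matrix (FIM). When one or more parameters are random
variables, the FIM is expressed in the following form [19]

$$ \mathbf{J} = E_f(\mathbf{J}_{\theta|f}) + \mathbf{J}_f, \quad (4) $$

where the expectation is taken over the random variable $f$,

$$ \mathbf{J}_{\theta|f} = -E\left( \frac{\partial^2}{\partial\theta\,\partial\theta^T} L(\mathbf{y}|f) \right) $$

is the standard (non-random parameter) FIM, with expectation
over the noise distribution, and $L(\mathbf{y}|f) \propto -\frac{1}{\sigma^2}\|\mathbf{y} - \mathbf{V}_f\mathbf{X}\mathbf{h}\|^2$
is the log-likelihood of the data vector when the values of $\mathbf{h}$
and $f$ are held constant. The matrix $\mathbf{J}_f$ is defined as follows:

$$ \mathbf{J}_f = -E_f\left( \frac{\partial^2}{\partial\theta\,\partial\theta^T} L(f) \right) $$

where $L(f) = \log p(f)$ and $p(f)$ is the distribution function
of the random variable $f$. For the parameter vector $\theta^T =
[f\ \mathbf{h}^T\ \mathbf{h}^H]$, the FIM has the following form [20]

$$ \mathbf{J}_{\theta|f} = \begin{bmatrix} A & \Lambda & \overline{\Lambda} \\ \Lambda^H & \Xi & \mathbf{0} \\ \Lambda^T & \mathbf{0} & \Xi^T \end{bmatrix} \quad (5) $$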



¹ The parameter $\sigma_f$ is considered known as it is a property of the receiver
hardware. Also, the noise variance $\sigma^2$ is uncoupled from the other parameters
and is estimated separately with no penalty.



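where $A$ is a scalar in this case. Let $[\mathbf{D}_\ell]_{nn} = 2n - 1 - N_\ell$
be a diagonal matrix such that

$$ \frac{\partial}{\partial f}\mathbf{V}_f = j\pi\,\mathbf{D}_\ell\mathbf{V}_f. \quad (6) $$

The submatrices of (5) are computed as

$$ A = \frac{2\pi^2}{\sigma^2}\|\mathbf{D}_\ell\mathbf{X}\mathbf{h}\|^2, \qquad \Lambda = \frac{-j\pi}{\sigma^2}\,\mathbf{h}^H\mathbf{X}^H\mathbf{D}_\ell\mathbf{X}, \qquad \Xi = \frac{1}{\sigma^2}\,\mathbf{X}^H\mathbf{X}. $$

None of these components depend on the random variable $f$
and therefore the expectation in (4) goes away. The matrix $\mathbf{J}_f$
is only non-zero in the first element and is

$$ [\mathbf{J}_f]_{11} = -E_f\left( \frac{\partial^2 L(f)}{\partial f^2} \right) = F_f, \quad (7) $$

where $F_f$ is the Fisher information of the random variable $f$
and $L(f)$ is the log-likelihood of $f$. The CRB for an estimator
of $f$ is then $[\mathbf{J}^{-1}]_{11}$, which can be calculated using the Schur
complement² [21] to be

$$ [\mathbf{J}^{-1}]_{11} = \left( \frac{2\pi^2}{\sigma^2}\|\mathbf{P}^\perp\mathbf{D}_\ell\mathbf{X}\mathbf{h}\|^2 + F_f \right)^{-1}, $$

where $\mathbf{P}^\perp = \mathbf{I} - \mathbf{X}(\mathbf{X}^H\mathbf{X})^{-1}\mathbf{X}^H$ is the projection matrix
onto the space orthogonal to the range of $\mathbf{X}$. As the Fisher
information is a positive number, it is clear that, to find the
worst case (maximum) CRB, $F_f$ must be minimized. We use
the following Lemma to show how this variable is minimized.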

Lemma 1: Let $p_\sigma(\cdot)$ represent the family of distributions
with mean zero and variance $\sigma^2$. Let $z$ be a random variable
distributed as $p_\sigma(z)$. The minimum of the Fisher information
of $z$, as defined in (7), over the family of distributions with
variance $\sigma^2$ is achieved when

$$ p_\sigma(z) = \mathcal{N}(0, \sigma^2). $$

Proof: Consider the following experiment: without any data,
design an estimator $\hat{z}$ for the random variable $z$. The log-likelihood
in this case is $L(z) = \log p_\sigma(z)$. If $\hat{z} = 0$, then this
estimator is unbiased and its variance is $\sigma^2$. By the Cramer-Rao
Theorem,

$$ \mathrm{var}(\hat{z}) = \sigma^2 \geq \frac{1}{F_z}. $$

Therefore, $F_z \geq \frac{1}{\sigma^2}$ with equality being achieved when $z \sim
\mathcal{N}(0, \sigma^2)$. □
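Lemma 1 can be checked numerically for particular members of the family. The sketch below is only an illustration under stated assumptions: it evaluates the Fisher information in its squared-score form $E[(\partial \log p / \partial z)^2]$ by a midpoint rule, and it compares a Gaussian with a Laplace density of the same variance (the Laplace choice, the value of `sigma`, and all function names are hypothetical, not taken from the paper).

```python
import math

def fisher_information(score, pdf, lo, hi, steps=200000):
    """Midpoint-rule estimate of E[(d/dz log p(z))^2] over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        z = lo + (i + 0.5) * h
        total += score(z) ** 2 * pdf(z) * h
    return total

sigma = 1.5                 # assumed standard deviation for the illustration
var = sigma ** 2
b = sigma / math.sqrt(2.0)  # Laplace scale giving the same variance 2*b^2

def gauss_pdf(z):
    return math.exp(-z * z / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def gauss_score(z):
    return -z / var

def laplace_pdf(z):
    return math.exp(-abs(z) / b) / (2.0 * b)

def laplace_score(z):
    return -math.copysign(1.0, z) / b

F_gauss = fisher_information(gauss_score, gauss_pdf, -30.0, 30.0)
F_laplace = fisher_information(laplace_score, laplace_pdf, -30.0, 30.0)
```

For these two densities the Gaussian value is close to $1/\sigma^2$ while the Laplace value is close to $2/\sigma^2$, consistent with the Gaussian being the minimizer.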

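By Lemma 1, the maximum CRB (over all distributions of $f$
with variance $2\sigma_f^2$) is

$$ C_f = \left( \frac{2\pi^2}{\sigma^2}\|\mathbf{P}^\perp\mathbf{D}_\ell\mathbf{X}\mathbf{h}\|^2 + \frac{1}{2\sigma_f^2} \right)^{-1}. \quad (8) $$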
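As a worked instance of (8), consider flat fading ($P = 1$) with a constant modulus training sequence. The sketch below rests on assumptions labeled in the comments: the projection leaves $\mathbf{D}_\ell\mathbf{x}h$ unchanged because the diagonal of $\mathbf{D}_\ell$ sums to zero, and `snr` stands for $|h|^2/\sigma^2$. The function name is hypothetical.

```python
import math

def listening_crb_flat(n_samples, snr, sigma_f2):
    """CRB (8) for one link under flat fading and |x_n| = 1 (a sketch)."""
    # Diagonal of D_l: d_n = 2n - 1 - N for n = 1..N, which sums to zero
    d = [2 * n - 1 - n_samples for n in range(1, n_samples + 1)]
    # Assumption: with X a single constant-modulus column, x^H D x = sum(d_n) = 0,
    # so the projection leaves D x h unchanged and ||P D x h||^2 = |h|^2 * sum(d_n^2)
    energy = sum(v * v for v in d)
    data_info = 2.0 * math.pi ** 2 * snr * energy   # snr = |h|^2 / sigma^2
    prior_info = 1.0 / (2.0 * sigma_f2)             # Gaussian prior, variance 2*sigma_f^2
    return 1.0 / (data_info + prior_info)

crb_no_prior = listening_crb_flat(16, 1.0, float("inf"))
crb_with_prior = listening_crb_flat(16, 1.0, 1e-4)
```

With an infinite prior variance the bound reduces to the standard frequency CRB; a finite $\sigma_f^2$ can only lower it.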



2 The {1,1} block of a block matrix inverse is [A 1 ]n = (An 
B. MAP Estimator of frequency

As a result of the preceding analysis, we use a Gaussian
prior distribution on $f$ to calculate the MAP estimator. This
choice of prior represents the least informative prior of all
distributions with variance $2\sigma_f^2$ and mean zero. For a particular
channel gain $\mathbf{h}$, the log-likelihood of the data is

$$ L(\mathbf{y}, f) = \ln p(\mathbf{y}, f) = \ln p(\mathbf{y}|f) + \ln p(f) \propto -\frac{1}{\sigma^2}\|\mathbf{y} - \mathbf{V}_f\mathbf{X}\mathbf{h}\|^2 - \frac{1}{4\sigma_f^2} f^2. \quad (9) $$

The apparent additional factor of two associated with $\sigma_f^2$ is due
to the fact that $f$ has a real Gaussian distribution as opposed to
complex (as in the first term above). For any given frequency,
the maximum of this expression over $\mathbf{h}$ is achieved when

$$ \hat{\mathbf{h}}(f) = (\mathbf{X}^H\mathbf{X})^{-1}\mathbf{X}^H\mathbf{V}_f^H\mathbf{y}. \quad (10) $$

To find the MAP estimator of $f$, we substitute (10) into (9)
and minimize the negative,

$$ \hat{f} = \arg\min_f \left\{ \|\mathbf{P}^\perp\mathbf{V}_f^H\mathbf{y}\|^2 + \frac{\sigma^2}{4\sigma_f^2} f^2 \right\}. \quad (11) $$

We note that as $\sigma_f$ goes to infinity (no prior information), the
estimator (11) is the standard ML frequency estimator [22].
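The estimator (11) can be evaluated by a direct search over the bounded frequency range. The following sketch is an illustration, not the paper's implementation: it assumes flat fading (so $\mathbf{X}$ is the single column $\mathbf{x}$), a uniform grid search, and hypothetical values for the channel gain, offset, and noise variance.

```python
import cmath
import math

def map_cost(f, y, x, noise_var, sigma_f2):
    """Cost in (11) for flat fading, where X reduces to the single column x."""
    n_samp = len(y)
    # V_f^H y: remove a trial frequency offset f from the received samples
    z = [y[n] * cmath.exp(-2j * math.pi * f * n) for n in range(n_samp)]
    # ||P_perp z||^2 = ||z||^2 - |x^H z|^2 / ||x||^2
    energy = sum(abs(v) ** 2 for v in z)
    corr = sum(x[n].conjugate() * z[n] for n in range(n_samp))
    x_energy = sum(abs(v) ** 2 for v in x)
    residual = energy - abs(corr) ** 2 / x_energy
    # Gaussian prior penalty: sigma^2 / (4 sigma_f^2) * f^2
    return residual + noise_var / (4.0 * sigma_f2) * f * f

def map_frequency_estimate(y, x, noise_var, sigma_f2, f_max=0.05, steps=2001):
    """Grid search over |f| <= f_max (f normalized by the sample rate)."""
    grid = [-f_max + 2.0 * f_max * i / (steps - 1) for i in range(steps)]
    return min(grid, key=lambda f: map_cost(f, y, x, noise_var, sigma_f2))

# Noise-free illustration with a hypothetical channel gain and offset
x = [complex(1.0, 0.0)] * 16
h, f_true = 0.8 * cmath.exp(0.3j), 0.01
y = [h * x[n] * cmath.exp(2j * math.pi * f_true * n) for n in range(16)]
f_weak_prior = map_frequency_estimate(y, x, noise_var=0.1, sigma_f2=1e6)
f_strong_prior = map_frequency_estimate(y, x, noise_var=0.1, sigma_f2=1e-8)
```

With a very weak prior the search recovers the true offset to within the grid spacing (the ML behavior), while a very strong prior pulls the estimate toward zero.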

IV. Cooperation Phase 

In the cooperation phase, the destination node receives the 
superposition of signals coming from the source and relay. 
Each of these signals is transmitted with a slightly different 
frequency due to system imperfections. The purpose of this 
section is to derive a mini-max estimator for the two frequency
offsets $f_{sd}$ and $f_{rd}$. The estimator is mini-max in the sense that
we design the (asymptotically) minimum variance estimator
given that the prior distribution on the frequencies maximizes
the estimator variance. We show there exists an optimal
transmit frequency for the relay, which reduces the variance
of frequency estimation at the destination.

As the relay has an estimate of $f_{sr}$ (which is correlated
with $f_{sd}$ and $f_{rd}$), this information is useful in reducing the
variance of the estimate at the destination. We assume the
frequency transmitted from the relay is adjusted according to
the following rule,

$$ f_{r,Tx} = f_r - \gamma \hat{f}_{sr} = f_r - \gamma (f_{sr} + e_{sr}), \quad (12) $$

where $\gamma$ is a parameter to be optimized and $e_{sr} = \hat{f}_{sr} - f_{sr}$
is the estimation error from the listening phase. We choose
this rule as it is a linear function of the estimate and thus
analytically tractable. When $\gamma = 0$, no frequency adjustment
is made (e.g., when the estimate $\hat{f}_{sr}$ provides no information
about the source's frequency), and when $\gamma = 1$, the relay
transmits its own estimate of the source's frequency (thus
trusting the estimate to provide all of the information available
about the source's frequency). We now express the frequency
difference between the destination and the relay as

$$ f_{rd} = f_d - f_{r,Tx} = f_{sd} - (1 - \gamma) f_{sr} + \gamma e_{sr}. \quad (13) $$

The two frequencies to be estimated at the destination node
are $f_{rd}$ and $f_{sd}$.
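The tuning rule and the resulting relay-destination offset can be traced with a few lines of arithmetic. This is a minimal sketch assuming the sign convention $f_{ab} = f_b - f_a$; the function names and the numerical values are hypothetical.

```python
from fractions import Fraction

def relay_tx_frequency(f_r, f_sr_hat, gamma):
    """Relay transmit frequency: f_r minus gamma times its estimate of f_sr."""
    return f_r - gamma * f_sr_hat

def rd_offset(f_s, f_r, f_d, e_sr, gamma):
    """Offset seen by the destination on the relay link, f_d - f_{r,Tx}."""
    f_sr = f_r - f_s                 # true source-relay offset
    f_sr_hat = f_sr + e_sr           # relay's estimate from the listening phase
    return f_d - relay_tx_frequency(f_r, f_sr_hat, gamma)

# Hypothetical normalized frequencies and estimation error
f_s, f_r, f_d = Fraction(1, 1000), Fraction(3, 1000), Fraction(-2, 1000)
e_sr = Fraction(1, 10000)
no_adjust = rd_offset(f_s, f_r, f_d, e_sr, Fraction(0))   # equals f_d - f_r
full_adjust = rd_offset(f_s, f_r, f_d, e_sr, Fraction(1)) # equals f_sd + e_sr
```

With $\gamma = 1$ the relay-destination offset differs from $f_{sd}$ only by the listening-phase estimation error, which is why the relay's own oscillator drops out of the problem in that case.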

A. Covariance of frequencies 

Before calculating the MAP estimator of $f_{sd}$ and $f_{rd}$, we
compute the least informative joint prior distribution. First,
the covariance matrix of these random variables is found and
then we show that the joint Gaussian distribution is the least
informative prior.

To proceed, we calculate the covariance matrix of $f_{sd}$, $f_{sr}$,
and $e_{sr}$. The means of $f_{sd}$ and $f_{sr}$ are zero, $E(f_{sd}^2) = E(f_{sr}^2) =
2\sigma_f^2$, and $E(f_{sd} f_{sr}) = \sigma_f^2$. Now consider $E(e_{sr})$ (we show
here that the MAP estimator derived above is asymptotically
unbiased, i.e., $E(e_{sr}) = 0$ for high SNR). Using the definition
of $e_{sr}$ and (11),

$$ e_{sr} + f_{sr} = \xi, \qquad \xi = \arg\min_f \left\{ \|\mathbf{V}_f\mathbf{P}^\perp\mathbf{V}_f^H\mathbf{y}_{sr}\|^2 + \frac{\sigma_r^2}{4\sigma_f^2} f^2 \right\}. $$

By expressing the expectation as

$$ E(e_{sr}) = E_{f_{sr}}\left( E_{e_{sr}|f_{sr}}(\xi - f_{sr} \,|\, f_{sr}) \right), $$

the conditional expectation $E_{e_{sr}|f_{sr}}(\xi | f_{sr})$ needs to be calculated.
Continuing the asymptotic analysis, for high SNR, we
replace $\mathbf{y}_{sr}$ with its mean and obtain

$$ E_{e_{sr}|f_{sr}}(\xi | f_{sr}) \approx \arg\min_f \left\{ \|\mathbf{V}_f\mathbf{P}^\perp\mathbf{V}_f^H\mathbf{V}_{f_{sr}}\mathbf{X}_\ell\mathbf{h}_{sr}\|^2 + \frac{\sigma_r^2}{4\sigma_f^2} f^2 \right\}, \quad (14) $$

where the approximation is exact in the limit $\sigma_r^2 \to 0$. We
perform the change of variables $f \to f - f_{sr}$ (equivalently, setting $f_{sr} = 0$),
so that $\mathbf{V}_{f_{sr}} = \mathbf{I}$. The first term in (14) is

$$ \mathbf{h}_{sr}^H\mathbf{X}_\ell^H\mathbf{V}_f\mathbf{P}^\perp\mathbf{V}_f^H\mathbf{X}_\ell\mathbf{h}_{sr}, $$

which is greater than or equal to zero and only equal to zero
when the shifted variable is zero (i.e., $f = f_{sr}$). This function is thus locally
convex about the point $f = f_{sr}$ and therefore locally quadratic.
The second order Taylor series approximation is

$$ \underbrace{\pi^2\|\mathbf{P}^\perp\mathbf{D}_\ell\mathbf{X}_\ell\mathbf{h}_{sr}\|^2}_{Q}\, f^2. $$

The value $Q$ can be considered the effective signal power
including all system and estimation gains. Returning to (14),

$$ E(\xi | f_{sr}) \approx \arg\min_f \Big\{ Q\,(f - f_{sr})^2 + \underbrace{\frac{\sigma_r^2}{4\sigma_f^2}}_{K} f^2 \Big\} = \frac{Q}{Q + K} f_{sr}. \quad (15) $$

Completing the mean of $e_{sr}$,

$$ E(e_{sr}) = E\left( \frac{Q}{Q + K} f_{sr} - f_{sr} \right) = 0 $$

because the mean of $f_{sr}$ is zero and thus the estimator is
asymptotically unbiased.

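Continuing on with the covariance,

$$ E(f_{sr} e_{sr}) = E(f_{sr} E(e_{sr} | f_{sr})) = \frac{-2K}{Q + K}\sigma_f^2 $$

and similarly $E(f_{sd} e_{sr}) = \frac{-K}{Q + K}\sigma_f^2$, where $K$ is defined
in (15). Following a similar argument as above for $E(e_{sr}^2)$
yields the result that the variance of $e_{sr}$ is $\frac{2K}{Q + K}\sigma_f^2$, which
is equal to the CRB in (8). Thus (11) is an asymptotically
efficient estimate of the frequency. In summary,

$$ \mathrm{Cov}(f_{sd}, f_{sr}, e_{sr}) = \sigma_f^2 \begin{bmatrix} 2 & 1 & \frac{-K}{Q+K} \\ 1 & 2 & \frac{-2K}{Q+K} \\ \frac{-K}{Q+K} & \frac{-2K}{Q+K} & \frac{2K}{Q+K} \end{bmatrix}. $$

With this covariance matrix calculated, the covariance of
$f_{sd}$ and $f_{rd}$ is

$$ \mathbf{R}_\mathbf{f} \triangleq \sigma_f^2 \begin{bmatrix} 2 & \frac{(1+\gamma)Q + K}{Q+K} \\ \frac{(1+\gamma)Q + K}{Q+K} & 2\,\frac{(1-\gamma+\gamma^2)Q + K}{Q+K} \end{bmatrix}. \quad (16) $$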
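The step from the three-variable covariance to the $2 \times 2$ covariance of $(f_{sd}, f_{rd})$ is a linear transformation, since $f_{rd} = f_{sd} - (1-\gamma) f_{sr} + \gamma e_{sr}$. The sketch below checks this with exact rational arithmetic; the function names and the test values of $Q$, $K$, and $\sigma_f^2$ are hypothetical.

```python
from fractions import Fraction

def joint_cov(Q, K, sigma_f2):
    """Covariance of (f_sd, f_sr, e_sr) as summarized in the text."""
    t = K / (Q + K)
    return [[2 * sigma_f2, sigma_f2, -t * sigma_f2],
            [sigma_f2, 2 * sigma_f2, -2 * t * sigma_f2],
            [-t * sigma_f2, -2 * t * sigma_f2, 2 * t * sigma_f2]]

def destination_cov(Q, K, sigma_f2, gamma):
    """Covariance of (f_sd, f_rd) obtained as T C T^T."""
    C = joint_cov(Q, K, sigma_f2)
    # Rows of T express f_sd and f_rd in terms of (f_sd, f_sr, e_sr)
    T = [[1, 0, 0], [1, -(1 - gamma), gamma]]
    return [[sum(T[i][a] * C[a][b] * T[j][b] for a in range(3) for b in range(3))
             for j in range(2)] for i in range(2)]

def closed_form(Q, K, sigma_f2, gamma):
    """Closed-form expression for the same covariance, as in (16)."""
    off = ((1 + gamma) * Q + K) / (Q + K)
    diag = 2 * ((1 - gamma + gamma ** 2) * Q + K) / (Q + K)
    return [[2 * sigma_f2, off * sigma_f2], [off * sigma_f2, diag * sigma_f2]]

Q, K, s2 = Fraction(7), Fraction(3), Fraction(1, 10000)
R_half = destination_cov(Q, K, s2, Fraction(1, 2))
```

Using `Fraction` avoids rounding, so the transformed matrix and the closed form can be compared for exact equality at any $\gamma$.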



B. Cramer-Rao Bound in Cooperative Phase 

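Recall the signal models for the cooperation phase (3) and
the listening phase (2) as well as the relation between the two
frequencies to be estimated, $f_{rd}$ and $f_{sd}$ (13). The unknown
parameters are $f_{sd}$, $f_{rd}$, $\mathbf{h}_{sd_c}$, $\mathbf{h}_{rd}$, and $\mathbf{h}_{sd_\ell}$. For compactness,
define $\mathbf{f} = [f_{sd}\ f_{rd}]^T$. The deterministic FIM ($\mathbf{J}_{\theta|\mathbf{f}}$) is a
$(2+6P) \times (2+6P)$ matrix with the structure of (5) where $\mathbf{A}$ is
$2 \times 2$. Given the frequency random variables, the distributions
of $\mathbf{y}_c$ and $\mathbf{y}_{sd_\ell}$ are independent and the joint distribution is
written as

$$ p(\mathbf{y}_c, \mathbf{y}_{sd_\ell}, \mathbf{f}) = p(\mathbf{y}_c|\mathbf{f})\, p(\mathbf{y}_{sd_\ell}|\mathbf{f})\, p(\mathbf{f}) $$

and the FIM is written as

$$ \mathbf{J}_\theta = \mathbf{J}_{\theta|\mathbf{f}}(\mathbf{y}_c) + \mathbf{J}_{\theta|\mathbf{f}}(\mathbf{y}_{sd_\ell}) + \mathbf{J}_\mathbf{f}. $$

The blocks of the matrix $\mathbf{J}_{\theta|\mathbf{f}}(\mathbf{y}_c)$ are

$$ A_{11,c} = \frac{2\pi^2}{\sigma_d^2}\|\mathbf{D}_c\mathbf{X}_{sd_c}\mathbf{h}_{sd_c}\|^2, \qquad A_{22,c} = \frac{2\pi^2}{\sigma_d^2}\|\mathbf{D}_c\mathbf{X}_{rd}\mathbf{h}_{rd}\|^2, $$

$$ A_{12,c} = A_{21,c} = \frac{2\pi^2}{\sigma_d^2}\Re\left\{ \mathbf{h}_{sd_c}^H\mathbf{X}_{sd_c}^H\mathbf{D}_c\mathbf{V}_{f_{sd}}^H\mathbf{V}_{f_{rd}}\mathbf{D}_c\mathbf{X}_{rd}\mathbf{h}_{rd} \right\}, $$

$$ \Lambda_{11,c} = \frac{-j\pi}{\sigma_d^2}\mathbf{h}_{sd_c}^H\mathbf{X}_{sd_c}^H\mathbf{D}_c\mathbf{X}_{sd_c}, \qquad \Lambda_{22,c} = \frac{-j\pi}{\sigma_d^2}\mathbf{h}_{rd}^H\mathbf{X}_{rd}^H\mathbf{D}_c\mathbf{X}_{rd}, $$

$$ \Lambda_{12,c} = \frac{-j\pi}{\sigma_d^2}\mathbf{h}_{sd_c}^H\mathbf{X}_{sd_c}^H\mathbf{V}_{f_{sd}}^H\mathbf{V}_{f_{rd}}\mathbf{D}_c\mathbf{X}_{rd}, \qquad \Lambda_{21,c} = \frac{-j\pi}{\sigma_d^2}\mathbf{h}_{rd}^H\mathbf{X}_{rd}^H\mathbf{V}_{f_{rd}}^H\mathbf{V}_{f_{sd}}\mathbf{D}_c\mathbf{X}_{sd_c}, $$

$$ \Xi_{11,c} = \frac{1}{\sigma_d^2}\mathbf{X}_{sd_c}^H\mathbf{X}_{sd_c}, \qquad \Xi_{22,c} = \frac{1}{\sigma_d^2}\mathbf{X}_{rd}^H\mathbf{X}_{rd}, \qquad \Xi_{12,c} = \Xi_{21,c}^H = \frac{1}{\sigma_d^2}\mathbf{X}_{sd_c}^H\mathbf{V}_{f_{sd}}^H\mathbf{V}_{f_{rd}}\mathbf{X}_{rd}, $$

and zero for terms not listed. The diagonal matrix $\mathbf{D}_c$ is
defined similar to $\mathbf{D}_\ell$ in (6) with $N_c$ replacing $N_\ell$.

For data obtained during the listening phase, the blocks of the matrix
$\mathbf{J}_{\theta|\mathbf{f}}(\mathbf{y}_{sd_\ell})$ are

$$ A_{11,\ell} = \frac{2\pi^2}{\sigma_d^2}\|\mathbf{D}_\ell\mathbf{X}_\ell\mathbf{h}_{sd_\ell}\|^2, \qquad \Lambda_{13,\ell} = \frac{-j\pi}{\sigma_d^2}\mathbf{h}_{sd_\ell}^H\mathbf{X}_\ell^H\mathbf{D}_\ell\mathbf{X}_\ell, \qquad \Xi_{33,\ell} = \frac{1}{\sigma_d^2}\mathbf{X}_\ell^H\mathbf{X}_\ell, $$

and zero for terms not listed.

To calculate $E(\mathbf{J}_{\theta|\mathbf{f}})$, note that only the (1,2) and (2,1)
cross terms of the submatrices above (i.e., $A_{12}$, $\Xi_{12}$, $\Lambda_{12}$, ...)
are dependent on the frequencies. In each case, the dependency
is of the form $\mathbf{A}\mathbf{V}_{f_{sd}}^H\mathbf{V}_{f_{rd}}\mathbf{B}$ where $\mathbf{A}$ and $\mathbf{B}$ are deterministic
matrices or vectors. Looking at the $n$th term of $\mathbf{V}_{f_{sd}}^H\mathbf{V}_{f_{rd}}$,

$$ E([\mathbf{V}_{f_{sd}}^H\mathbf{V}_{f_{rd}}]_{nn}) = E(e^{j\pi (f_{rd} - f_{sd}) d_n}), $$

where $d_n = 2n - 1 - N_c$. This expectation is just the characteristic
function of the random variable $f_{rd} - f_{sd}$ evaluated at $\pi d_n$
(denoted as $\Phi_{f_{rd}-f_{sd}}(\pi d_n)$). Let $[\mathbf{M}]_{nn} = \Phi_{f_{rd}-f_{sd}}(\pi d_n)$ be
a diagonal matrix, then we replace $\mathbf{V}_{f_{sd}}^H\mathbf{V}_{f_{rd}}$ with $\mathbf{M}$ in all
cross terms of the FIM blocks. The FIM is then expressed as

$$ \mathrm{FIM} = E(\mathbf{J}_{\theta|\mathbf{f}}) + \mathbf{J}_\mathbf{f} \quad (17) $$

where $\mathbf{J}_\mathbf{f}$ is nonzero only in the upper left $2 \times 2$ block and
this block is equal to $\mathbf{F}_\mathbf{f}$, the Fisher information matrix of $f_{sd}$
and $f_{rd}$. Using the Schur complement of the upper left $2 \times 2$
block of (17), the CRBs for the frequencies are the diagonal
entries of

$$ \mathbf{C}_\mathbf{f} = \left( \mathbf{A} - 2\Re\{\Lambda\Xi^{-1}\Lambda^H\} + \mathbf{F}_\mathbf{f} \right)^{-1}. \quad (18) $$

In the sequel, we desire to make conclusions about the
performance of the collaborative system based on the derived
bounds. As the absolute phase of the signal at each node
is hard to control and cannot be relied on to remain stable
over time, we find the worst case CRB and use this in
the subsequent discussion. That is, replacing $\mathbf{h}_a$ with $\mathbf{h}_a e^{j\phi}$, we find
the phase $\phi$ maximizing the CRB (18). The resulting expression is

$$ \mathbf{C}_{\mathbf{f},\max} = \left( \tilde{\mathbf{A}} - 2\,\mathrm{abs}\{\Lambda\Xi^{-1}\Lambda^H\} + \mathbf{F}_\mathbf{f} \right)^{-1}, \quad (19) $$

where $\tilde{A}_{jj} = A_{jj}$ and $\tilde{A}_{ij} = -|A_{ij}|$ for $i \neq j$.
Effectively, the phase $\phi$ is chosen to maximize the magnitude of
the off-diagonals of the matrix to be inverted in (19), which
in turn maximizes the diagonals of the inverse (the negative
signs are chosen for the off-diagonal terms because the FIM
of the prior distribution, as calculated in the next section, also
has negative off-diagonal terms).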
C. Distribution of Frequencies

We now desire to find the distribution of $f_{sd}$ and $f_{rd}$ which
maximizes the CRB for a given frequency covariance $\mathbf{R}_\mathbf{f}$ (16).
In order to do this, we assume the training sequences are
chosen to provide near optimal performance. Examining (19),
an ideal set of training sequences would zero out the off-diagonal
terms in $\mathbf{A}$ and also zero out the $(\Lambda\Xi^{-1}\Lambda^H)$ term.
Thus for any constant modulus training sequences, the best
CRB is

$$ \mathbf{C}_{\mathbf{f},opt} = \left( \mathbf{A}_{opt} + \mathbf{F}_\mathbf{f} \right)^{-1} \quad (20) $$

where $\mathbf{A}_{opt} = \mathrm{diag}\{\mathbf{A}\}$. We show in Section V, by simulation,
that sequences exist where (19) is close to (20). Under the
assumption of a good set of sequences, the dependence on the
distribution of $f_{sd}$ and $f_{rd}$ enters only through $\mathbf{F}_\mathbf{f}$. We use
the following lemma to find the distribution maximizing the
CRB.

Lemma 2: For $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$ positive definite Hermitian
matrices, if $\mathbf{B} > \mathbf{C}$ (i.e., $\mathbf{B} - \mathbf{C}$ is positive definite), then
$(\mathbf{A} + \mathbf{C})^{-1} - (\mathbf{A} + \mathbf{B})^{-1}$ has positive diagonal entries.

Proof: By assumption, $(\mathbf{A} + \mathbf{B}) > (\mathbf{A} + \mathbf{C})$, which implies
$(\mathbf{A} + \mathbf{C})^{-1} > (\mathbf{A} + \mathbf{B})^{-1}$. Thus the difference of the matrices is
positive definite Hermitian and therefore has positive diagonal
elements. □

To maximize the diagonal elements of the CRB (20), Lemma 2
implies $\mathbf{F}_\mathbf{f}$ must be as small as possible. Using an argument similar
to the scalar case of Lemma 1, the Gaussian distribution
satisfies this requirement and $\mathbf{F}_\mathbf{f} = \mathbf{R}_\mathbf{f}^{-1}$. The assumption
that $f_s$, $f_r$, $f_d$ and the estimation error from the listening phase
$e_{sr}$ are jointly Gaussian is therefore the least informative prior
given the specified variances and correlations.

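D. Optimal $\gamma$

With the aim of deriving a mini-max estimator, we desire
to choose $\gamma$ in (12) to minimize the trace of $\mathbf{C}_\mathbf{f}$ (20). As this
expression is not intuitive, it is helpful to consider a flat fading
model. For flat fading, $P = 1$ and the terms in the optimal
CRB (20) are

$$ \mathbf{A}_{opt} = \begin{bmatrix} \eta_c S_{sd_c} + \eta_\ell S_{sd_\ell} & 0 \\ 0 & \eta_c S_{rd} \end{bmatrix} $$

and

$$ \mathbf{R}_\mathbf{f} = \sigma_f^2 \begin{bmatrix} 2 & \frac{2\eta_\ell (1+\gamma) S_{sr} + 1/\sigma_f^2}{2\eta_\ell S_{sr} + 1/\sigma_f^2} \\ \frac{2\eta_\ell (1+\gamma) S_{sr} + 1/\sigma_f^2}{2\eta_\ell S_{sr} + 1/\sigma_f^2} & 2\,\frac{2\eta_\ell (1-\gamma+\gamma^2) S_{sr} + 1/\sigma_f^2}{2\eta_\ell S_{sr} + 1/\sigma_f^2} \end{bmatrix}, $$

where $\eta_\ell = \frac{2}{3}\pi^2 N_\ell (N_\ell^2 - 1)$ (similarly for $\eta_c$) and $S_{rd} =
|h_{rd}|^2/\sigma_d^2$ is the signal to noise ratio of the relay-destination link
(similarly for $S_{sd_c}$, $S_{sd_\ell}$, and $S_{sr}$).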
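The flat fading expressions make it straightforward to evaluate the trace of the bound numerically and search for the minimizing $\gamma$. The sketch below is an illustration under assumptions that are not from the paper: $\gamma$ is restricted to a uniform grid on $[0, 1]$, $\mathbf{F}_\mathbf{f} = \mathbf{R}_\mathbf{f}^{-1}$ (Gaussian prior), $\mathbf{A}_{opt} = \mathrm{diag}(\eta_c S_{sd_c} + \eta_\ell S_{sd_\ell},\ \eta_c S_{rd})$, and all function names and parameter values are hypothetical.

```python
import math

def eta(n):
    """eta = (2/3) * pi^2 * N * (N^2 - 1) for a length-N preamble."""
    return (2.0 / 3.0) * math.pi ** 2 * n * (n * n - 1)

def crb_trace(gamma, sigma_f2, s_sdc, s_sdl, s_rd, s_sr, n_l=16, n_c=16):
    """Trace of (A_opt + R_f^{-1})^{-1} for flat fading (a sketch)."""
    q = 2.0 * eta(n_l) * s_sr
    k = 1.0 / sigma_f2
    # Entries of R_f from the flat fading expression
    r11 = 2.0 * sigma_f2
    r12 = sigma_f2 * ((1.0 + gamma) * q + k) / (q + k)
    r22 = 2.0 * sigma_f2 * ((1.0 - gamma + gamma * gamma) * q + k) / (q + k)
    det_r = r11 * r22 - r12 * r12
    # F_f = R_f^{-1} (2x2 inverse), added to the diagonal A_opt
    m11 = eta(n_c) * s_sdc + eta(n_l) * s_sdl + r22 / det_r
    m22 = eta(n_c) * s_rd + r11 / det_r
    m12 = -r12 / det_r
    # trace of a 2x2 inverse is (m11 + m22) / det
    return (m11 + m22) / (m11 * m22 - m12 * m12)

def optimal_gamma(sigma_f2, s_sdc, s_sdl, s_rd, s_sr, steps=201):
    """Grid search for gamma on [0, 1]; the search range is an assumption."""
    grid = [i / (steps - 1) for i in range(steps)]
    return min(grid, key=lambda g: crb_trace(g, sigma_f2, s_sdc, s_sdl, s_rd, s_sr))

gamma_small_sigma = optimal_gamma(1e-10, 1.0, 1.0, 1.0, 1.0)
```

For a very small $\sigma_f^2$ the search settles near $\gamma = 1/2$, in line with the small-$\sigma_f$ limiting case discussed in the text.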

An exact calculation of the optimal $\gamma$ leads to a long,
complicated expression that depends on the SNR of each link
and the variance of the frequency oscillators. The expression
is omitted here as it gives no insight into the problem. Later,
we show there is minimal loss when $\gamma$ is always set to 1. To
gain some insight into the behavior of $\gamma$, consider two limiting
cases for $\gamma_{opt}$: $\sigma_f \to \infty$ and $\sigma_f \to 0$.




Fig. 2. Plot of optimal $\gamma$ as a function of modeled frequency variation $\sigma_f$
(horizontal axis in dB relative to the sample rate). Three
curves are shown for different values of gain between the source and relay.
The SNR of the source-destination and relay-destination node pairs are held
constant at 0 dB.



1) Large $\sigma_f$, or no prior information: By taking the limit
of the expression for $\gamma_{opt}$ as $\sigma_f \to \infty$, it can be shown that
$\gamma_{opt} \to 1$. In this case, the relay transmits at a frequency
equal to its estimate of the source frequency. By choosing this
transmit frequency, the operating frequency of the relay $f_r$
is removed from the estimation procedure as it contains no
information about the source-destination frequency.

2) Small $\sigma_f$: When $\sigma_f \to 0$ (or when $1/\sigma_f^2$ is much larger
than any of the link SNRs, perhaps due to poor channel SNRs),
the CRB is minimized when $\gamma = 1/2$. By looking at the MAP
frequency estimator (11) for this limiting case, the frequency
estimate is zero. Therefore, no matter what $\gamma$ is chosen, the
relay just transmits at its own frequency. When $\sigma_f$ is small
(but not zero), there is still some information in the frequency
estimate about the source frequency (besides the information
from the local oscillator model), and by choosing $\gamma \approx 1/2$,
both sources of information are used to select the best transmit
frequency.

As an example of the function $\gamma_{opt}$, Figure 2 shows plots
of several curves of $\gamma_{opt}$ versus $\sigma_f$. The length of the training
signal is $N_\ell = N_c = 16$ and the SNR of the source-destination
link is −3 dB (combining the listening and cooperation phases,
the effective SNR is 0 dB). The SNR from relay to destination,
$S_{rd}$, is also 0 dB and there is one curve each for $S_{sr} \in
\{-10\ \mathrm{dB}, 0\ \mathrm{dB}, 10\ \mathrm{dB}\}$. For each curve, the transition
from $\gamma_{opt} = 1$ to $\gamma_{opt} = 1/2$ appears to occur roughly when
$\sigma_f^2 \approx (\eta_c S_{sd_c} + \eta_\ell S_{sd_\ell})^{-1}$ or $\sigma_f^2 \approx (\eta_c S_{rd})^{-1}$. These values of $\sigma_f$
are significant because, for example, when $\sigma_f^2 < (\eta_\ell S_{sd_\ell})^{-1}$ (left
half of the plot), the assumed prior knowledge of frequency
has more weight than the data, whereas when $\sigma_f^2 > (\eta_\ell S_{sd_\ell})^{-1}$,
the information in the data is more important than the prior
model.



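E. MAP Estimator of $f_{sd}$ and $f_{rd}$

To calculate the MAP estimate of $f_{sd}$ and $f_{rd}$ at the
destination node during the cooperation phase, the covariance
between these two random variables (16) is needed. Therefore,
the values of $Q$ and $K$ need to be forwarded to the destination
node. The log-likelihood of the data at the destination is

$$ L(\mathbf{y}_c, \mathbf{y}_{sd_\ell}, \mathbf{f}) = \ln p(\mathbf{y}_c|\mathbf{f}) + \ln p(\mathbf{y}_{sd_\ell}|\mathbf{f}) + \ln p(\mathbf{f}) $$
$$ \propto -\frac{1}{\sigma_d^2}\|\mathbf{y}_{sd_\ell} - \mathbf{V}_{f_{sd}}\mathbf{X}_\ell\mathbf{h}_{sd_\ell}\|^2 - \frac{1}{\sigma_d^2}\|\mathbf{y}_c - \mathbf{X}(\mathbf{f})\mathbf{g}\|^2 - \frac{1}{2}\mathbf{f}^T\mathbf{R}_\mathbf{f}^{-1}\mathbf{f}, \quad (21) $$

where $\mathbf{X}(\mathbf{f}) \triangleq [\mathbf{V}_{f_{sd}}\mathbf{X}_{sd_c}\ \ \mathbf{V}_{f_{rd}}\mathbf{X}_{rd}]$ and $\mathbf{g}^T =
[\mathbf{h}_{sd_c}^T\ \mathbf{h}_{rd}^T]$. As before, choose estimates for $\mathbf{g}$ and $\mathbf{h}_{sd_\ell}$
to maximize the likelihood for any given frequency pair,

$$ \hat{\mathbf{g}}(\mathbf{f}) = (\mathbf{X}^H\mathbf{X})^{-1}\mathbf{X}^H\mathbf{y}_c, $$
$$ \hat{\mathbf{h}}_{sd_\ell}(f_{sd}) = (\mathbf{X}_\ell^H\mathbf{X}_\ell)^{-1}\mathbf{X}_\ell^H\mathbf{V}_{f_{sd}}^H\mathbf{y}_{sd_\ell}. $$

Substituting these estimates into (21) and minimizing the
negative yields the MAP frequency estimator

$$ \hat{\mathbf{f}} = \arg\min_\mathbf{f} \left\{ \|\mathbf{P}^\perp(\mathbf{f})\mathbf{y}_c\|^2 + \|\mathbf{P}_\ell^\perp\mathbf{V}_{f_{sd}}^H\mathbf{y}_{sd_\ell}\|^2 + \frac{\sigma_d^2}{2}\mathbf{f}^T\mathbf{R}_\mathbf{f}^{-1}\mathbf{f} \right\}, \quad (22) $$

where $\mathbf{P}^\perp(\mathbf{f})$ and $\mathbf{P}_\ell^\perp$ are the projections onto the spaces orthogonal to the
ranges of $\mathbf{X}(\mathbf{f})$ and $\mathbf{X}_\ell$, respectively.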



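We note the special case of $\sigma_f \to \infty$ (which implies
$\gamma_{opt} = 1$). For $\gamma = 1$, the covariance (16) needed in the MAP
estimator simplifies to

$$ \mathbf{R}_\mathbf{f} = \sigma_f^2 \begin{bmatrix} 2 & \frac{2Q+K}{Q+K} \\ \frac{2Q+K}{Q+K} & 2 \end{bmatrix}, $$

which has a finite inverse when $\sigma_f < \infty$. However, when
$\sigma_f \to \infty$, we evaluate the limit of $\mathbf{R}_\mathbf{f}^{-1}$ resulting in

$$ \lim_{\sigma_f \to \infty} \mathbf{R}_\mathbf{f}^{-1} = \zeta\zeta^T \frac{1}{C_{f_{sr}}} = \zeta\zeta^T \frac{2\pi^2}{\sigma_r^2}\|\mathbf{P}^\perp\mathbf{D}_\ell\mathbf{X}_\ell\mathbf{h}_{sr}\|^2, $$

where $\zeta^T = [1\ \ {-1}]$ and $C_{f_{sr}}$ is the CRB of the frequency in
the source-relay link (8) with $\sigma_f = \infty$. The penalty term (last
term) of the MAP estimator (22) simplifies to

$$ \frac{\sigma_d^2}{2}\mathbf{f}^T\mathbf{R}_\mathbf{f}^{-1}\mathbf{f} = \frac{\sigma_d^2}{2 C_{f_{sr}}}(f_{sd} - f_{rd})^2. $$

Thus the penalty term is a quadratic of the frequency difference
term normalized by the ratio of error variances (noise power
over frequency estimation error variance).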

V. Simulations 

In the previous section, we showed the optimal $\gamma$ for
extreme values of $\sigma_f$ is either 1 or 1/2 and when $\gamma_{opt}$
approaches 1/2, its effect is small because the frequency
adjustment is going toward zero. In this section, we show by
simulation that the penalty for choosing $\gamma = 1$ instead of $\gamma = \gamma_{opt}$
is usually limited to a few tenths of a decibel. Thus, near
optimal performance is achieved without communicating any
of the link SNRs back to the relay for calculation of $\gamma_{opt}$.
We also show the existence of training sequences where (19)
is close to (20). Finally, we show the benefit of letting the



Fig. 3. Plot of the loss in performance caused by a binary training sequence as
opposed to an arbitrary sequence, and when choosing $\gamma = 1$ versus $\gamma = \gamma_{opt}$.
Relay-destination and source-destination SNRs are the same and source-relay
SNR is 10 dB higher. (Horizontal axis: source-destination SNR, $S_{sd}$.)



relay set its transmit frequency based on information received 
during the listening phase. 

We simulate a three node system in a frequency flat environment.
In all simulations, we use the SNR of the $sd$ link
(assuming $S_{sd_c} = S_{sd_\ell}$) as a reference value. The following
configuration is considered: let $S_{rd} = S_{sd_c}$ and then vary
the link SNR of the source-relay link relative to $S_{sd_\ell}$. Let
$N_\ell = N_c$. We assume the prior distribution for the operating
frequency is Gaussian with a variance of −40 dB relative to
the sample rate (e.g., a 2 parts-per-million variance of a local
oscillator at 900 MHz with 4.5 MHz sample rate [12]).

For flat fading channels and constant modulus training
sequences, it is sufficient to choose $\mathbf{x}_\ell = \mathbf{1}$ (the vector of all
ones) and $\mathbf{x}_{sd_c} = \mathbf{1}$. A search is performed to find the $\mathbf{x}_{rd}$ which
minimizes the CRB (19). For values of $N_c \in \{4, 8, 16\}$
an exhaustive search over all binary sequences is performed
(results hold independent of the choice between $\gamma = \gamma_{opt}$ and
$\gamma = 1$) and for values of $N_c > 16$, a randomized search over
binary sequences is performed. For each value of $N_c$ (up to
128) the optimal sequence for $\mathbf{x}_{rd}$ has the following structure:

Sequence Design: Let $a_1 = [1, -1]^T$ and 

$$a_n = a_1 \otimes a_{n-1},$$

where $a_n$ is length $2^n$ and is the last column of a Sylvester 
matrix. Then the length $N_c = 2^n$ optimal sequence is 

$$x_{rd,opt} = \begin{bmatrix} a_{n-1} \\ J a_{n-1} \end{bmatrix},$$

where $J$ is the exchange matrix which reverses the order of 
elements in the vector it multiplies. 
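
The construction above can be sketched in a few lines of Python. This is only an illustration, assuming the recursion is the Kronecker product $a_n = a_1 \otimes a_{n-1}$; the function names (`kron`, `sylvester_last_column`, `sylvester`, `relay_training_sequence`) are hypothetical and not taken from the paper.

```python
def kron(u, v):
    """Kronecker product of two vectors stored as flat lists."""
    return [ui * vj for ui in u for vj in v]


def sylvester_last_column(n):
    """a_n = a_1 (x) a_{n-1} with a_1 = [1, -1]; the result has length 2**n."""
    a1 = [1, -1]
    a = a1
    for _ in range(n - 1):
        a = kron(a1, a)
    return a


def sylvester(n):
    """2**n x 2**n Sylvester (Hadamard) matrix, used only as a cross-check."""
    h = [[1]]
    for _ in range(n):
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def relay_training_sequence(num_symbols):
    """x_rd,opt = [a_{n-1}; J a_{n-1}] for N_c = 2**n with n >= 2."""
    n = num_symbols.bit_length() - 1
    if num_symbols < 4 or 2 ** n != num_symbols:
        raise ValueError("num_symbols must be a power of two, at least 4")
    half = sylvester_last_column(n - 1)
    return half + half[::-1]  # J reverses the order of the elements


# Length-16 binary sequence, the length used for Figures 4 to 6.
x_rd_16 = relay_training_sequence(16)
```

Building the full Sylvester matrix is not needed to obtain the sequence; `sylvester` is included only so that the last-column property stated in the text can be checked directly.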

For the configuration described above, and with $S_{sr} = 10\,S_{sd,c}$, 
Figure 3 shows the difference between the best 
possible CRB (20) (for any constant modulus sequence and 
$\gamma = \gamma_{opt}$) and the worst case CRB (19) using the binary 
sequence shown above and $\gamma = 1$. The 0.6 dB difference 
for $N_c = 4$ is primarily due to a non-optimal sequence 
$x_{rd}$, whereas the 0.2 dB difference for other values of $N_c$ 
is due to choosing $\gamma = 1$ instead of the optimal value. The 
loss in performance due to a non-optimal sequence decreases 
dramatically as $N_c$ increases. These loss values are typical of 
other system configurations as well. The system behavior as a 
function of training sequence illustrates the fact that the CRB 
is insensitive to the selection of these sequences. 

Fig. 4. Plot of the sum of Cramer-Rao Bounds for $f_{sd}$ and $f_{rd}$. Circle and 
"x"-marks show bound when $\sigma_f^2 = -40$ dB and $\gamma = \gamma_{opt}$, plus marks show 
bound when $\gamma = 0$, and triangles show bound when $\gamma = $ and $\sigma_f^2 = \infty$ 
(the standard frequency bound assuming no prior information). All curves are 
for a length 16 training sequence. (Horizontal axis: source-destination SNR, $S_{sd}$ (dB).) 

Figure 4 shows the sum of the CRBs for the two frequencies 
estimated at the destination node as a function of $S_{sd,c}$. For 
this figure, the SNRs of the source-destination link ($S_{sd}$) and 
relay-destination link ($S_{rd}$) are the same. The circle and "x"-marks 
show the CRB when the SNR of the source-relay link 
$S_{sr}$ is, respectively, the same as and 10 dB higher than $S_{sd}$. 
The plus marks show the CRB when $\gamma = 0$. The difference 
between the plus marks and the circle and "x"-marks shows 
the potential gain in estimation performance from changing the 
relay's transmit frequency (the benefit is greater when the SNR is 
large). The triangles show the CRB when no prior information 
is used, which reveals the large advantage of using a prior model 
when the SNR is low. 



VI. Sub-Optimal Algorithms 

The maximum-likelihood frequency estimator (22) requires 
a two-dimensional search over the frequency range of interest. 
As this is a computationally expensive approach to estimation, 
we compare the mean squared error (MSE) performance of 
more efficient, sub-optimal estimation algorithms and introduce 
a correlation-based estimator as the best compromise 
between estimation performance and computational efficiency. 
In the remainder of this section, we describe the use of the 
one-dimensional ML algorithm as applied to the two-signal 
case and the correlation algorithm for frequency estimation, 
and compare their performance. 



A. One-Dimensional ML 

As a result of the choice of the training sequence, estimation 
of the two frequencies is nearly uncoupled. Therefore, per- 
forming two independent one-dimensional ML searches for the 
frequencies is approximately the same as performing the full 
two-dimensional ML search as required by the ML algorithm. 
Given the data vector $y_c$, the one-dimensional ML estimates 
of the frequencies are 

$$\hat{f}_{rd} = \arg\min_{f}\,(\,\cdots\,) \tag{23}$$

$$\hat{f}_{sd} = \arg\min_{f}\,(\,\cdots\,) \tag{24}$$



which do not take the correlations between the frequencies into 
account. To improve the estimates (23) and (24), we assume 
the variance of each estimate meets the CRB assuming the 
prior information is uncorrelated for each frequency: 

$$C_f = \left(A - 2\Re\{A H^{-1} A^{*}\} + \mathrm{diag}\{R_f\}^{-1}\right)^{-1},$$

where $\mathrm{diag}\{R_f\}$ is a diagonal matrix consisting of the 
diagonal entries of $R_f$ (zeroing out the other elements). This 
assumption is valid for high SNR and large $N_c$. Incorporating 
this knowledge with the prior information, the least squares 
estimates of the frequencies are 



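$$\begin{bmatrix} \hat{f}_{sd,ML1} \\ \hat{f}_{rd,ML1} \end{bmatrix} = R_f (R_f + C_f)^{-1} \begin{bmatrix} \hat{f}_{sd} \\ \hat{f}_{rd} \end{bmatrix}. \tag{25}$$

B. Correlation Method 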



We first describe a standard correlation frequency estimation 
method as presented in [23] and then provide an extension to 
allow this algorithm to work in the presence of two signals 
with known training sequences. Assuming a single signal in 
the presence of flat fading, 

$$y[n] = e^{j 2\pi f n}\, x[n] + w[n], \qquad 1 \le n \le N.$$

The estimated autocorrelation sequence of $y[n]$ is 

$$\hat{R}[k] = \frac{1}{N-k} \sum_{i=k+1}^{N} \left(y[i]\,x^{*}[i]\right)\left(y[i-k]\,x^{*}[i-k]\right)^{*}.$$



The estimate of the frequency is calculated as 




(26) 



where $M$ is a design parameter and the frequency estimate is 
unambiguous if 

$$|f| < \frac{1}{M+1}.$$

Therefore, $M$ trades performance for estimation range. The 
performance of this algorithm (26) is shown in [23] to be close 
to the CRB when $M = N/2$. To ensure adequate estimation 
range, the maximum allowed value of $M$ is 12 (corresponding 
to a range of five standard deviations away from the mean of 
the prior). To incorporate the prior knowledge of the 
frequency variance, the estimate (26) is adjusted according to 
the following rule 



$$\hat{f} \leftarrow \frac{\sigma_f^2}{\sigma_f^2 + c_f}\,\hat{f},$$



where $c_f$ is the CRB of the frequency estimate with no prior 
information. Let 

$$\hat{f} = p(y, x, \sigma_f)$$

be a function that takes the data vector $y$, training vector 
$x$, and prior information as inputs, and outputs the frequency estimate 
according to the above algorithm. This algorithm is used 
without modification during the listening phase to calculate 
the estimate $\hat{f}_{sr} = p(y_{sr}, x, \sigma_f)$, with $x$ the training vector sent in that phase. 
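
A minimal Python sketch of such an estimator is given below. Two points are assumptions rather than statements from the text: the phase-of-summed-lags form used for (26), which is one estimator consistent with the stated range $|f| < 1/(M+1)$, and the scalar shrinkage $\sigma_f^2/(\sigma_f^2 + c_f)$ used for the prior adjustment. The names `correlation_estimate` and `prior_adjusted_estimate`, and the choice to pass the CRB as an argument, are hypothetical.

```python
import cmath
import math


def correlation_estimate(y, x, m):
    """Frequency (cycles/sample) from the first m autocorrelation lags."""
    n = len(y)
    # Remove the known training symbols so only the rotation (and noise) remains.
    z = [yi * complex(xi).conjugate() for yi, xi in zip(y, x)]
    total = 0j
    for k in range(1, m + 1):
        # R[k] = 1/(N-k) * sum_{i=k+1}^{N} z[i] z*[i-k]  (0-based indices below)
        total += sum(z[i] * z[i - k].conjugate() for i in range(k, n)) / (n - k)
    # ASSUMED form for (26): phase of the summed lags, scaled by pi*(M+1).
    return cmath.phase(total) / (math.pi * (m + 1))


def prior_adjusted_estimate(y, x, prior_var, crb_no_prior, m):
    """Correlation estimate shrunk toward a zero-mean prior (assumed scalar rule)."""
    raw = correlation_estimate(y, x, m)
    return raw * prior_var / (prior_var + crb_no_prior)


# Noiseless illustration with made-up numbers (not the paper's simulation setup).
f_true = 0.01
x = [1, -1, -1, 1] * 4
y = [xi * cmath.exp(2j * math.pi * f_true * n) for n, xi in enumerate(x, start=1)]
f_raw = correlation_estimate(y, x, m=8)
f_adj = prior_adjusted_estimate(y, x, prior_var=1e-4, crb_no_prior=1e-4, m=8)
```

In the noiseless case every lag satisfies $\hat{R}[k] = e^{j2\pi f k}$, so the raw estimate recovers the true offset exactly when the offset lies inside the unambiguous range. With equal prior variance and CRB, the adjusted estimate is half the raw one.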

For the cooperation phase, there are two signals present and 
the undesired signal acts as interference for the desired signal 
being estimated. The estimates provided by the correlation 
algorithm are 



$$\hat{f}_{sd,1} = p(y_c, x_{sd}, \sigma_f) \tag{27}$$

$$\hat{f}_{rd,1} = p(y_c, x_{rd}, \sigma_f), \tag{28}$$

which exhibit a floor in MSE (see Figure 5). To improve the 
estimates, we project out the undesired signal in the following 
manner: 

$$y_{c,sd} = P^{\perp}_{\hat{f}_{rd,1},\,x_{rd}}\; y_c$$

$$y_{c,rd} = P^{\perp}_{\hat{f}_{sd,1},\,x_{sd}}\; y_c$$

where the frequency estimates in (27) and (28) are used to 
calculate the interference signal, which is projected out. The 
correlation algorithm is run a second time to find 

$$\hat{f}_{sd,2} = p(y_{c,sd}, x_{sd}, \sigma_f)$$

$$\hat{f}_{rd,2} = p(y_{c,rd}, x_{rd}, \sigma_f).$$
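
The projection step can be illustrated with the following Python sketch. It assumes the interference signal is the interfering training sequence rotated by its first-pass frequency estimate, and that "projecting out" means orthogonal projection onto the complement of that single vector. The helper names (`modulate`, `inner`, `project_out`) and all numerical values are hypothetical.

```python
import cmath
import math


def modulate(x, f):
    """Training sequence x rotated by frequency f (cycles/sample), n = 1..N."""
    return [xi * cmath.exp(2j * math.pi * f * n) for n, xi in enumerate(x, start=1)]


def inner(u, v):
    """Inner product u^H v of two complex vectors."""
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))


def project_out(y, v):
    """Return (I - v v^H / (v^H v)) y, i.e. y with its component along v removed."""
    scale = inner(v, y) / inner(v, v)
    return [yi - scale * vi for yi, vi in zip(y, v)]


# Illustrative two-signal observation; gains and offsets are made up.
x_sd = [1] * 8
x_rd = [1, -1, -1, 1, 1, -1, -1, 1]
v_sd = modulate(x_sd, 0.004)
v_rd = modulate(x_rd, -0.007)  # stands in for the first-pass estimate of f_rd
y_c = [0.8 * a + 1.1 * b for a, b in zip(v_sd, v_rd)]

# Data used for the second-pass estimate of f_sd: relay signal projected out.
y_c_sd = project_out(y_c, v_rd)
residual = abs(inner(v_rd, y_c_sd))
```

When the first-pass estimate is exact, as in this toy example, the projected data has no component along the interfering signal. In practice the first-pass estimate is noisy, so some interference remains after projection.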

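The final frequency estimates, with all prior information 
accounted for, are calculated similarly to (25), by applying the 
same weighting to the second-pass estimates: 

$$R_f (R_f + C_f)^{-1} \begin{bmatrix} \hat{f}_{sd,2} \\ \hat{f}_{rd,2} \end{bmatrix}.$$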
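
The weighting used in (25) and in the final estimate can be sketched as follows for the two-frequency case. The sketch assumes $R_f$ is the $2 \times 2$ prior covariance of the frequency offsets (zero-mean prior) and $C_f$ the covariance attributed to the data-only estimates; the function names and the numbers in the example are hypothetical.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul2(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def combine_with_prior(r_f, c_f, estimates):
    """Return R_f (R_f + C_f)^{-1} applied to the vector [f_sd, f_rd]."""
    total = [[r_f[i][j] + c_f[i][j] for j in range(2)] for i in range(2)]
    weight = matmul2(r_f, inv2(total))
    return [weight[i][0] * estimates[0] + weight[i][1] * estimates[1]
            for i in range(2)]


# Illustrative numbers only: uncorrelated prior and estimate covariances.
r_f = [[1e-4, 0.0], [0.0, 1e-4]]
c_f = [[1e-4, 0.0], [0.0, 3e-4]]
f_final = combine_with_prior(r_f, c_f, [0.008, -0.004])
```

With diagonal matrices the weighting reduces to a per-frequency factor $\sigma_f^2/(\sigma_f^2 + c_f)$, here 0.5 and 0.25. An estimate with a large CRB is pulled strongly toward the prior mean, while a precise estimate is left almost unchanged.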



Figure 5 shows the total MSE (the sum of the errors from $\hat{f}_{sd}$ 
and $\hat{f}_{rd}$) of the correlation algorithm compared with the CRB 
for $N_c = 16$. The triangle markers denote the performance of 
the algorithm without any adaptation while the circle markers 
denote the performance of the adaptive two-step algorithm 
described above. For lower SNRs, the adaptive algorithm has 
about a 3 dB advantage, while the performance difference is 
much greater at higher SNRs (above 15 dB). The performance 
of the adaptive algorithm is near optimal. The slight "bump" 
in performance of the two algorithms at $S_{sd} = -10$ dB 
is caused by the interaction of the threshold region (the region 
where the MSE performance breaks away from the CRB) 
and the region dominated by prior information (where the 
algorithms converge to a $-34$ dB MSE relative to the sample 
rate). 

For the same scenario, Figure 6 compares the three 
estimation algorithms: full (two-dimensional search) maximum 
likelihood (circles), one-dimensional ML ("x"-marks), 
and the adaptive correlation algorithm (triangles). Each of 
these algorithms approaches the CRB asymptotically in SNR. 
The differences in behavior at lower SNRs are attributed to 
the different algorithms entering their threshold regions at 
different SNRs. A more detailed analysis of this region can 
be carried out using the methods of [24]. 

Fig. 5. Plot of mean squared error of non-adaptive (circles) and adaptive 
two-step (triangles) correlation algorithms. The mean squared error is compared 
with the CRB. 

Fig. 6. Plot of mean squared error of full (two-dimensional search) 
ML (circles), one-dimensional ML ("x"-marks), and adaptive correlation 
(triangles). The mean squared error is compared with the CRB. 

VII. Conclusions 

In this paper, we have derived the Cramer-Rao bounds 
for frequency offset estimation in a three-node collaborative 
communication system. We have shown through simulation 
the performance increase obtained by allowing the relay to 
change its transmitting frequency. We have also shown that there 
exists an optimal transmit frequency for the relay node based 
on the other link SNRs and the assumed prior knowledge of 
the frequency offsets. However, there is only a small (tenths of 
decibels) penalty if the relay always transmits at its estimate of 
the source frequency. Simulation results also demonstrate the 
existence of binary training sequences that result in very little 
loss as compared with an arbitrary constant modulus sequence. 
We also derived a computationally efficient correlation-based 
estimation algorithm that has mean squared error performance 
close to the CRB. 